13 research outputs found

    Classification of Emotion and Polarity from Twitter Data

    Classifying public information from microblogging and social networking services can yield insights into public opinion on different services and products, and such data are among the most useful indicators of that opinion. In this research, real-time Twitter data about the iPad and iPhone were collected in different locations and classified by polarity (positive or negative) and by emotion (anger, joy, sadness, disgust, fear, and surprise). The collected tweets were pre-processed to generate document-level ground truth. Supervised machine learning algorithms were then used to classify the tweets, using cross-validation and partitioning the data by city. The classifiers' performance measures were compared to identify a suitable algorithm for the data sets. K-NN, Naïve Bayes, and SVM all reached reasonable accuracy rates; however, K-NN outperformed Naïve Bayes, SVM, and ZeroR in both accuracy and model training time. With cross-validation, K-NN achieved the highest accuracy rates, 96.58% and 99.94% for the iPad and iPhone emotion data sets respectively. With the data partitioned by city, K-NN achieved the highest accuracy rates, 98.8% and 99.95% for the iPad and iPhone emotion data sets respectively. On the polarity data sets, K-NN achieved 100% accuracy under both cross-validation and per-city partitioning.
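
The evaluation setup described in the abstract above (a K-NN classifier scored with cross-validation) can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the toy tweets, the bag-of-words features, the cosine similarity measure, and the function names are all assumptions introduced here.

```python
from collections import Counter

def bag_of_words(text):
    """Lower-case a text and count its whitespace-separated tokens."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train, query, k=3):
    """Return the majority label among the k training items most similar to query."""
    ranked = sorted(train, key=lambda item: cosine_similarity(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def cross_validate(samples, n_folds=5, k=3):
    """Return the accuracy pooled over all folds of an interleaved k-fold split."""
    data = [(bag_of_words(text), label) for text, label in samples]
    correct = 0
    for fold in range(n_folds):
        test = data[fold::n_folds]                      # every n-th item is held out
        train = [d for i, d in enumerate(data) if i % n_folds != fold]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)

# Hypothetical toy data, not taken from the study.
TOY_TWEETS = [
    ("love my new ipad", "positive"),
    ("the iphone camera is great", "positive"),
    ("great screen love it", "positive"),
    ("i love this phone", "positive"),
    ("hate the battery life", "negative"),
    ("my iphone is broken again", "negative"),
    ("terrible update hate it", "negative"),
    ("broken screen terrible service", "negative"),
]

accuracy = cross_validate(TOY_TWEETS, n_folds=4, k=1)
print(f"pooled cross-validation accuracy: {accuracy:.2f}")
```

Partitioning by city, as the abstract also describes, would replace the interleaved split with one held-out group per city.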

    Sequential learning and shared representation for sensor-based human activity recognition

    Human activity recognition based on sensor data has rapidly attracted considerable research attention due to its wide range of applications, including senior monitoring, rehabilitation, and healthcare. These applications require accurate human activity recognition systems to track and understand human behaviour. Yet developing such systems poses critical challenges, because learning from temporal sequential sensor data is difficult given the variation and complexity of human activities. The main challenges are accuracy and robustness, which are affected by the diversity and similarity of human activities, their skewed distribution, and the lack of a rich quantity of well-curated human activity data. This thesis addresses these challenges by developing robust deep sequential learning models that boost the performance of human activity recognition, handle imbalanced class problems, and reduce the need for a large amount of annotated data. It develops a set of new networks specifically designed for the challenges of building better HAR systems than existing methods. First, the thesis proposes robust sequential deep learning models that recognise human activities accurately and boost the performance of human activity recognition systems relative to current methods on data collected from smart home and wearable sensors. The proposed methods integrate convolutional neural networks and different attention mechanisms to efficiently process human activity data and capture significant information for recognising human activities. Next, the thesis proposes methods to address imbalanced class problems in human activity recognition systems. Joint learning of sequential deep learning algorithms, i.e., long short-term memory and convolutional neural networks, is proposed to boost the performance of human activity recognition, particularly for infrequent human activities.
In addition, the thesis proposes a data-level solution to imbalanced class problems that extends the synthetic minority over-sampling technique (SMOTE); the extension, named iSMOTE, accurately labels the generated synthetic samples. These methods have improved the results for minority human activities and outperformed the current state-of-the-art methods. The thesis also proposes sequential deep learning networks that boost the performance of human activity recognition while reducing the dependency on a rich quantity of well-curated human activity data through transfer learning techniques. A multi-domain learning network is proposed to process data from multiple domains, transfer knowledge across different but related domains of human activities, and mitigate isolated learning paradigms using a shared representation. The proposed method has three advantages. First, it reduces the need and effort for labelled data in the target domain: the network uses a restricted amount of training data from the target domain together with the full training data of the source domain, yet provides better performance than using the full training data in a single-domain setting. Second, it can be used for small datasets. Last, it reduces training time by rendering a generic model for related domains rather than fitting a model for each domain separately. In addition, the thesis proposes a self-supervised model to reduce the need for a considerable amount of annotated human activity data. The self-supervised method is pre-trained on unlabelled data and fine-tuned on a small amount of labelled data for supervised learning. The proposed self-supervised pre-training network renders human activity representations that are semantically meaningful and provides a good initialization for supervised fine-tuning.
The developed network enhances the performance of human activity recognition while minimizing the need for a considerable amount of labelled data. The proposed models are evaluated on multiple public benchmark datasets of sensor-based human activities and compared with existing state-of-the-art methods. The experimental results show that the proposed networks boost the performance of human activity recognition systems.
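
The data-level idea behind SMOTE, which the thesis extends, is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows only that basic interpolation step under assumed inputs; it is not the iSMOTE extension, whose labelling procedure the abstract does not detail, and the function name and parameters are hypothetical.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class feature tuples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical minority-class feature vectors.
minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote(minority_samples, n_new=5)
print(new_samples)
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies.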

    Emotion and polarity prediction from Twitter

    Classifying public information from microblogging and social networking services can yield insights into social and public opinion on different services, products, and events, and such data are among the most useful indicators of that opinion. The aim of this paper is to classify tweets with supervised machine learning algorithms, using cross-validation and partitioning the data by city. To this end, real-time Twitter data mentioning the iPad and iPhone were collected in different locations and classified by polarity (positive or negative) and by emotion (anger, joy, sadness, disgust, fear, and surprise). We collected over eighty thousand tweets, which were pre-processed to generate document-level ground truth and labelled according to emotion and polarity. We also compared the performance of the K-NN, Naïve Bayes, and SVM classifiers. We found that K-NN, Naïve Bayes, SVM, and ZeroR all reached reasonable accuracy rates; however, K-NN outperformed Naïve Bayes, SVM, and ZeroR in both accuracy and model training time. With cross-validation, K-NN achieved the highest accuracy rates, 96.58% and 99.94% for the iPad and iPhone emotion data sets respectively. With the data partitioned by city, K-NN achieved the highest accuracy rates, 98.8% and 99.95% for the iPad and iPhone emotion data sets respectively. On the polarity data sets, K-NN achieved 100% accuracy under both cross-validation and per-city partitioning.

    Overview of Human Activity Recognition Using Sensor Data

    Human activity recognition (HAR) is an essential research field that has been used in different applications including home and workplace automation, security and surveillance as well as healthcare. Starting from conventional machine learning methods to the recently developing deep learning techniques and the Internet of things, significant contributions have been shown in the HAR area in the last decade. Even though several review and survey studies have been published, there is a lack of sensor-based HAR overview studies focusing on summarising the usage of wearable sensors and smart home sensors data as well as applications of HAR and deep learning techniques. Hence, we overview sensor-based HAR, discuss several important applications that rely on HAR, and highlight the most common machine learning methods that have been used for HAR. Finally, several challenges of HAR are explored that should be addressed to further improve the robustness of HAR

    Towards Reliable, Stable and Fast Learning for Smart Home Activity Recognition

    The population of industrialized societies is ageing, which calls for more intelligent tools to monitor human activities. These tools often aim to support senior people in their homes, keep track of their daily activities, and detect potential health problems early, facilitating a long and independent life. Recent advances in smart environments using miniaturized sensors and wireless communications have facilitated unobtrusive human activity recognition. Human activity recognition has been an active field of research due to its broad applications in areas such as healthcare and smart home monitoring. This thesis project develops machine learning systems to improve the understanding of human activity patterns in smart home environments. One of its contributions is to process and share information across multiple smart homes in order to reduce the learning time, reduce the need and effort to recollect training data, and increase the accuracy of applications such as activity recognition. To that end, several contributions are presented that pave the way for transferring knowledge among smart homes. Firstly, a method to align manifolds is proposed to facilitate transfer learning. Secondly, we propose a method that further improves the performance of activity recognition over existing methods. Moreover, we explore imbalanced class problems in human activity recognition and propose a method to handle imbalanced human activities. A summary of these studies is provided below. In our work, it is hypothesized that aligning learned low-dimensional manifolds from disparate datasets can be used to transfer knowledge between different but related datasets. The t-distributed Stochastic Neighbor Embedding (t-SNE) is used to project the high-dimensional input dataset into low-dimensional manifolds.
However, since t-SNE is a stochastic algorithm and t-SNE maps show a large variance, a thorough analysis of their stability is required before applying transfer learning. In response, an extension of Local Procrustes Analysis called Normalized Local Procrustes Analysis (NLPA) is proposed, which aligns manifolds non-linearly using locally linear mappings in order to test the stability of t-SNE low-dimensional manifolds. Experiments show that the disparity obtained when aligning low-dimensional manifolds with NLPA is an order of magnitude lower than the disparity obtained by Procrustes Analysis (PA). NLPA outperforms PA and provides much better alignments of the low-dimensional manifolds. This indicates that t-SNE low-dimensional manifolds are locally stable, which is part of the contribution of this thesis. Human activity recognition in smart homes shows satisfactory results using existing methods. These methods often process the sensor readings that precede the evaluation time (where the decision is made) in order to deliver real-time human activity recognition. However, there are several critical situations, such as diagnosing people with dementia, where "preceding sensor activations" are not always sufficient to accurately recognize the resident's daily activities at each evaluated time. To improve performance, we propose a method that delays the recognition process to include some sensor activations that occur after the point in time where the decision needs to be made. For this, the proposed method uses multiple incremental fuzzy temporal windows to extract features from both preceding and some oncoming sensor activations. The proposed method is evaluated with two temporal deep learning models, a one-dimensional convolutional neural network (1D CNN) and long short-term memory (LSTM), on a binary sensor dataset of real daily living activities.
The experimental evaluation shows that the proposed method achieves significantly better results than the previous state of the art. Further, one of the main problems of activity recognition in a smart home setting is that the frequency and duration of human activities are intrinsically imbalanced. The huge difference in the number of observations per category means that many machine learning algorithms focus on classifying the majority examples, owing to their higher prior probability, while ignoring or misclassifying minority examples. This thesis explores well-known class imbalance approaches (synthetic minority over-sampling technique, cost-sensitive learning and ensemble learning) applied to activity recognition data with two temporal data pre-processing methods for the deep learning models LSTM and 1D CNN. It proposes a data-level perspective combined with a temporal window technique to handle imbalanced human activities from smart homes, making the learning algorithms more sensitive to the minority class. The experimental results indicate that handling imbalanced human activities at the data level outperforms algorithm-level approaches and improves classification performance.
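
The delayed-recognition idea described above, extracting features from sensor activations both before and shortly after the decision time, can be illustrated with a simplified sketch. Note the assumptions: the windows here are crisp (hard-edged) rather than fuzzy, and the window lengths, the delay, the event format, and the function name are all hypothetical choices made for illustration, not the thesis's actual configuration.

```python
def window_features(events, t, past_windows=(60, 300, 900), delay=60):
    """Count sensor activations in nested windows around decision time t.

    events: list of (timestamp_seconds, sensor_id) tuples.
    past_windows: incremental window lengths (seconds) looking back from t.
    delay: length (seconds) of the single look-ahead window after t.
    Returns a dict mapping (sensor_id, window_label) to an activation count.
    """
    features = {}
    for ts, sensor in events:
        # Preceding activations: each longer window contains the shorter ones.
        for w in past_windows:
            if t - w <= ts <= t:
                key = (sensor, f"past_{w}s")
                features[key] = features.get(key, 0) + 1
        # Oncoming activations: only available if recognition is delayed.
        if t < ts <= t + delay:
            key = (sensor, f"next_{delay}s")
            features[key] = features.get(key, 0) + 1
    return features

# Hypothetical binary-sensor activations (seconds since midnight, sensor name).
sensor_events = [(0, "door"), (100, "kitchen"), (590, "kitchen"), (620, "bed")]
features = window_features(sensor_events, t=600)
for key in sorted(features):
    print(key, features[key])
```

A fuzzy variant would replace the hard inclusion tests with a membership weight that decays towards the window edges.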

    Efficacy of Imbalanced Data Handling Methods on Deep Learning for Smart Homes Environments

    Human activity recognition, as an engineering tool as well as an active research field, has become fundamental to many applications in fields such as health care, smart home monitoring and surveillance. However, delivering sufficiently robust activity recognition systems from sensor data recorded in a smart home setting is a challenging task. Moreover, human activity datasets are typically highly imbalanced because certain activities generally occur more frequently than others. Consequently, it is challenging to train classifiers on imbalanced human activity datasets. Deep learning algorithms perform well on balanced datasets, yet their performance cannot be guaranteed on imbalanced datasets. Therefore, we aim to address the problem of class imbalance in deep learning for smart home data. We assess it on Activities of Daily Living recognition using a binary sensor dataset. This paper proposes a data-level perspective combined with a temporal window technique to handle imbalanced human activities from smart homes, making the learning algorithms more sensitive to the minority class. The experimental results indicate that handling imbalanced human activities at the data level outperforms algorithm-level approaches and improves classification performance. © The Author(s) 2020. Funding: Open access funding provided by Halmstad University. This research is supported by the Knowledge Foundation under the project of the Center for Applied Intelligent Systems, under Grant Agreement No. 20100271.

    Stability analysis of the t-SNE algorithm for human activity pattern data

    Health technology systems that learn from and react to how humans behave in sensor-equipped environments are now being commercialized. These systems rely on the assumption that training data and testing data share the same feature space and come from the same underlying distribution, which is commonly unrealistic in real-world applications. Instead, the use of transfer learning could be considered. In order to transfer knowledge between a source and a target domain, the two should be mapped to a common latent feature space. In this work, the dimensionality reduction algorithm t-SNE is used to map data to a similar feature space and is further investigated through a proposed novel analysis of output stability. The proposed analysis, Normalized Linear Procrustes Analysis (NLPA), extends the existing Procrustes and Local Procrustes algorithms for aligning manifolds. The methods are tested on data reflecting human behaviour patterns collected in a smart home environment. Results show high partial output stability of the t-SNE algorithm on the tested input data, for which NLPA is able to detect clusters that are individually aligned and compared. The results highlight the importance of understanding output stability before incorporating dimensionality reduction algorithms into further computation, e.g. for transfer learning.
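
For context, the ordinary Procrustes disparity that NLPA builds on can be computed in closed form for two-dimensional point sets such as t-SNE maps. The sketch below is classical Procrustes Analysis only, not NLPA: it assumes the two maps list the same points in the same order, considers rotation, scaling and translation but not reflection, and uses hypothetical function names.

```python
import math

def _standardise(points):
    """Centre a 2-D point set on the origin and scale it to unit Frobenius norm."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centred = [(x - cx, y - cy) for x, y in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centred))
    return [(x / norm, y / norm) for x, y in centred]

def procrustes_disparity(reference, other):
    """Sum of squared distances after optimally rotating and scaling other onto reference."""
    a = _standardise(reference)
    b = _standardise(other)
    # Closed-form optimal rotation for 2-D data (reflections are not considered).
    num = sum(bx * ay - by * ax for (ax, ay), (bx, by) in zip(a, b))
    den = sum(bx * ax + by * ay for (ax, ay), (bx, by) in zip(a, b))
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    scale = math.hypot(num, den)  # optimal scale once both sets have unit norm
    aligned = [(scale * (bx * c - by * s), scale * (bx * s + by * c)) for bx, by in b]
    return sum((ax - px) ** 2 + (ay - py) ** 2 for (ax, ay), (px, py) in zip(a, aligned))

# Hypothetical maps: a square, the same square rotated/scaled/shifted, and a rectangle.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved_square = [(5, 5), (5, 8), (2, 8), (2, 5)]
rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]

print(procrustes_disparity(square, moved_square))
print(procrustes_disparity(square, rectangle))
```

A disparity near zero means the two maps differ only by a similarity transform; a local variant would apply the same computation per neighbourhood or cluster instead of to the whole map.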
